Visual Reasoning


Visual Reasoning over Time Series via Multi-Agent System

Feb 03, 2026
Via arXiv

RegionReasoner: Region-Grounded Multi-Round Visual Reasoning

Feb 03, 2026
Via arXiv

Bongards at the Boundary of Perception and Reasoning: Programs or Language?

Feb 03, 2026
Via arXiv

Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling

Feb 03, 2026
Via arXiv

VIRAL: Visual In-Context Reasoning via Analogy in Diffusion Transformers

Feb 03, 2026
Via arXiv

SRA-Seg: Synthetic to Real Alignment for Semi-Supervised Medical Image Segmentation

Feb 03, 2026
Via arXiv

DuoGen: Towards General Purpose Interleaved Multimodal Generation

Feb 03, 2026
Via arXiv

IVC-Prune: Revealing the Implicit Visual Coordinates in LVLMs for Vision Token Pruning

Feb 03, 2026
Via arXiv

FinMTM: A Multi-Turn Multimodal Benchmark for Financial Reasoning and Agent Evaluation

Feb 03, 2026
Via arXiv

SwiftVLM: Efficient Vision-Language Model Inference via Cross-Layer Token Bypass

Feb 03, 2026
Via arXiv